Victor Shea-Jay Huang

Action Emergence from Streaming Intent

May 14, 2026
Via arXiv

Driving Intents Amplify Planning-Oriented Reinforcement Learning

May 14, 2026
Via arXiv

MindVLA-U1: VLA Beats VA with Unified Streaming Architecture for Autonomous Driving

May 14, 2026
Via arXiv

The Side Effects of Being Smart: Safety Risks in MLLMs' Multi-Image Reasoning

Jan 20, 2026
Via arXiv

JPS: Jailbreak Multimodal Large Language Models with Collaborative Visual Perturbation and Textual Steering

Aug 07, 2025
Via arXiv

Lumina-mGPT 2.0: Stand-Alone AutoRegressive Image Modeling

Jul 23, 2025
Via arXiv

How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study

May 21, 2025
Via arXiv

Vision-to-Music Generation: A Survey

Mar 27, 2025
Via arXiv

TIDE: Temporal-Aware Sparse Autoencoders for Interpretable Diffusion Transformers in Image Generation

Mar 10, 2025
Via arXiv

Towards Precise Scaling Laws for Video Diffusion Transformers

Nov 25, 2024
Via arXiv